Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection
In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology
and framework for efficient and effective real-time malware detection that
leverages the best of conventional machine learning (ML) and deep learning
(DL) algorithms. In PROPEDEUTICA, every software process in the system begins
execution subject to a conventional ML detector for fast classification. If a
piece of software receives a borderline classification, it undergoes
further analysis via more computationally expensive and more accurate DL
methods, using our newly proposed DL algorithm, DEEPMALWARE. Further, we
introduce delays into the execution of software undergoing DL analysis as a
way to "buy time" for the analysis and to rate-limit the impact of possible malware in
the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and
877 commonly used benign software samples from various categories for the
Windows OS. Our results show that the false positive rate for conventional ML
methods can reach 20%, and for modern DL methods it is usually below 6%.
However, the classification time for DL can be 100X longer than conventional ML
methods. PROPEDEUTICA improved the detection F1-score from 77.54% (a
conventional ML method) to 90.25%, and reduced the detection time by 54.86%.
The percentage of software subjected to DL analysis was approximately 40% on
average. Moreover, applying delays to software undergoing DL analysis reduced
the detection time by approximately 10%. Finally, we found and discussed a
discrepancy between the detection accuracy offline (analysis after all traces
are collected) and on-the-fly (analysis in tandem with trace collection). Our
insights show that conventional ML and modern DL-based malware detectors in
isolation cannot meet the needs of efficient and effective malware detection:
high accuracy, low false positive rate, and short classification time.
Comment: 17 pages, 7 figures
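The two-stage cascade described above can be illustrated with a minimal sketch. All function names, scores, and thresholds below are hypothetical stand-ins, not the paper's actual models: only borderline fast-path scores are escalated to the expensive DL stage.

```python
# Minimal sketch of a PROPEDEUTICA-style two-stage cascade (hypothetical
# stand-ins; the paper's actual detectors are a conventional ML classifier
# and the DEEPMALWARE deep network).

def fast_ml_score(features):
    # Stand-in for the cheap conventional ML detector: returns a
    # malware probability in [0, 1].
    return sum(features) / len(features)

def deep_dl_score(features):
    # Stand-in for the slow, more accurate DL detector.
    return max(features)

def classify(features, low=0.3, high=0.7):
    """Cascade: trust confident fast scores; escalate the borderline
    band [low, high] to the expensive DL stage."""
    score = fast_ml_score(features)
    if score < low:
        return ("benign", "ml")
    if score > high:
        return ("malware", "ml")
    # Borderline: here the real system would delay (rate-limit) the
    # process while the DL model analyzes its behavior.
    score = deep_dl_score(features)
    return ("malware" if score > 0.5 else "benign", "dl")

print(classify([0.1, 0.2, 0.1]))  # confident fast path
print(classify([0.4, 0.5, 0.6]))  # borderline, escalated to DL
```

The borderline band is the key tuning knob: widening it trades classification latency (more processes hit the slow DL stage) for accuracy.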
A Praise for Defensive Programming: Leveraging Uncertainty for Effective Malware Mitigation
A promising avenue for improving the effectiveness of behavioral-based
malware detectors would be to combine fast traditional machine learning
detectors with high-accuracy but time-consuming deep learning models. The main
idea would be to place software receiving borderline classifications from
traditional machine learning methods in an environment where uncertainty is
added while the software is analyzed by the more time-consuming deep learning
models. The goal of the uncertainty would be to rate-limit the actions of
potential malware during the time-consuming deep analysis. In this paper, we present a detailed
description of the analysis and implementation of CHAMELEON, a framework for
realizing this uncertain environment for Linux. CHAMELEON offers two
environments for software: (i) standard - for any software identified as benign
by conventional machine learning methods and (ii) uncertain - for software
receiving borderline classifications when analyzed by these conventional
machine learning methods. The uncertain environment adds obstacles to software
execution through random perturbations applied probabilistically on selected
system calls. We evaluated CHAMELEON with 113 applications and 100 malware
samples for Linux. Our results showed that, at a threshold of 10%, intrusive
and non-intrusive strategies caused approximately 65% of the malware to fail
to accomplish their tasks, while approximately 30% of the analyzed benign
software met with various levels of disruption. With a dynamic, per-system-call
threshold, CHAMELEON caused 92% of the malware to fail, and only 10% of
the benign software to be disrupted. We also found that I/O-bound software was
three times more affected by uncertainty than CPU-bound software. Further, we
analyzed the logs of software that crashed under non-intrusive strategies and
found that some crashes were due to bugs in the software itself.
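The per-system-call perturbation decision can be sketched as follows. The syscall set, strategy names, and threshold semantics are illustrative assumptions, not CHAMELEON's actual implementation: each selected call is perturbed with probability given by the threshold, using one of the strategy families.

```python
import random

# Hypothetical sketch of an "uncertain environment": selected system
# calls are perturbed probabilistically, with strategies ranging from
# non-intrusive (delays) to intrusive (errors, silently dropped work).

PERTURBABLE = {"read", "write", "send", "recv"}

def maybe_perturb(syscall, threshold, rng=random):
    """Return a perturbation strategy for this call, or None to let
    the call proceed untouched."""
    if syscall not in PERTURBABLE:
        return None
    if rng.random() >= threshold:
        return None
    # Pick one strategy family at random for this call.
    return rng.choice(["delay", "return_error", "silence"])

# A dynamic, per-syscall threshold (as in the paper's best result)
# would replace the single `threshold` with a per-call value tuned
# from observed behavior.
```

Applying the perturbation only probabilistically is what makes the environment "uncertain": the software cannot reliably detect or adapt to the interference.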
"Vanilla" malware : vanishing antiviruses by interleaving layers and layers of attacks
Malware are persistent threats to any networked system. The recent increase in multi-core, distributed systems has created new opportunities for malware authors to exploit such capabilities. In particular, the distributed execution of a malware across multiple cores may be used to evade currently widespread single-core-based detectors (e.g., antiviruses, or AVs) and malware analysis solutions that are unable to correlate data from multiple sources. In this paper, we propose a technique for distributing malware functions across several distinct "vanilla" processes to show that AVs can be easily evaded. Our technique thus allows malware to interleave layers of attacks and remain undetected by current AVs. Our goal is to expose a real menace and to discuss it so as to provide insights for the development of better AVs. We discuss the role of distributed and multicore-based malware in current and future threat scenarios, with practical examples that we specially crafted for testing (e.g., a distributed sample synchronized via cache side channels). We (i) review multi-threaded/multi-processed implementation issues (from kernel and userland) and present a multi-core-based monitoring solution; (ii) present strategies for code distribution, exemplified via DLL injectors, and discuss their weak and strong points; and (iii) evaluate how real security solutions perform when exposed to distributed malware. We converted real, serial malware to parallel code and showed that current AVs are not fully able to detect multi-core malware.
Funding: CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior; CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico (24/2014; 23038.007604/2014-69164745/2017-)
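The code-distribution idea can be illustrated with a harmless sketch: a workload is split across distinct worker processes so that no single process performs the complete behavior a per-process detector looks for. The stages here are benign placeholders, not the paper's DLL-injection technique.

```python
from multiprocessing import Pool

# Benign illustration of distributed execution: each worker process
# runs only one fragment of the overall task, and the results are
# correlated by a coordinator. A detector inspecting any single
# process sees only a fragment of the full behavior.

def stage(part):
    # Placeholder for one fragment of the distributed work.
    return part.upper()

def run_distributed(parts):
    # One worker process per fragment; the coordinator reassembles.
    with Pool(processes=len(parts)) as pool:
        return "".join(pool.map(stage, parts))
```

This is also why the paper argues detectors must correlate data across processes: the coordinator's reassembly step is the only place the complete behavior exists.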
Leveraging branch traces to understand kernel internals from within
Kernel monitoring is often a hard task, requiring external debuggers and/or modules to be performed successfully. These requirements make analysis procedures more complicated, because multiple machines, albeit virtualized ones, are required; they also make analysis procedures more expensive. In this paper, we present the Lightweight Kernel Tracer (LKT), an alternative solution for tracing the kernel from within, leveraging branch monitors for data collection and an address-based introspection procedure for context reconstruction. We evaluated LKT by tracing distinct machines powered by x64 Windows kernels and show that LKT may be used for understanding the kernel's internals (e.g., the graphics and USB subsystems) and for system profiling. We also show how to use LKT to trace other tracing and monitoring mechanisms running in the kernel, such as antiviruses and sandboxes.
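The address-based context reconstruction step can be sketched as a lookup of raw branch addresses against a sorted symbol table. The addresses and symbol names below are synthetic stand-ins, not taken from LKT; a real tracer would build the table from the running kernel's exported symbols.

```python
import bisect

# Synthetic symbol table: (start address, routine name), sorted by
# address. Branch monitors yield raw addresses; mapping each one to
# the nearest preceding symbol recovers which routine was executing.

SYMBOLS = [
    (0xFFFF800000001000, "NtCreateFile"),
    (0xFFFF800000002000, "NtReadFile"),
    (0xFFFF800000003000, "NtWriteFile"),
]

def resolve(addr, symbols=SYMBOLS):
    """Map a raw address to the nearest preceding symbol name."""
    starts = [s[0] for s in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    return symbols[i][1] if i >= 0 else "<unknown>"
```

Binary search over the sorted starts keeps each lookup logarithmic, which matters when every recorded branch produces two addresses to resolve.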